DENCLUE 2.0: Fast Clustering Based on Kernel Density Estimation

Authors

  • Alexander Hinneburg
  • Hans-Henning Gabriel
Abstract

The DENCLUE algorithm employs a cluster model based on kernel density estimation. A cluster is defined by a local maximum of the estimated density function. Data points are assigned to clusters by hill climbing, i.e., points that climb to the same local maximum are put into the same cluster. A disadvantage of DENCLUE 1.0 is that its hill climbing may take unnecessarily small steps at the beginning and never converges exactly to the maximum; it only comes close. We introduce a new hill-climbing procedure for Gaussian kernels that adjusts the step size automatically at no extra cost. We prove that the procedure converges exactly towards a local maximum by reducing it to a special case of the expectation-maximization algorithm. We show experimentally that the new procedure needs far fewer iterations and can be accelerated by sampling-based methods while sacrificing only a small amount of accuracy.
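
The abstract does not state the update rule, so the following Python sketch is an illustration under stated assumptions, not the authors' implementation. It assumes the step-size-free hill climbing takes the form of a kernel-weighted mean of the data (the mean-shift style fixed-point iteration for a Gaussian kernel with bandwidth `h`). The function names, the tolerance `tol`, and the `merge_radius` used to decide that two climbs reached the same maximum are hypothetical choices made for this example.

```python
import numpy as np


def gaussian_hill_climb(x0, data, h, tol=1e-8, max_iter=500):
    """Climb from x0 toward a local maximum of a Gaussian kernel density estimate.

    Assumed update: the next position is the kernel-weighted mean of the data,
    so no explicit step size has to be chosen.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        sq_dist = np.sum((data - x) ** 2, axis=1)
        # Shift by the smallest distance so the weights cannot all underflow;
        # the common factor cancels in the ratio below.
        weights = np.exp(-(sq_dist - sq_dist.min()) / (2.0 * h ** 2))
        x_new = weights @ data / weights.sum()
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x


def assign_clusters(data, h, merge_radius):
    """Give the same label to points whose climbs end at the same maximum.

    merge_radius is a hypothetical tolerance: two end points closer than this
    are treated as the same local maximum.
    """
    modes, labels = [], []
    for point in data:
        end = gaussian_hill_climb(point, data, h)
        for k, mode in enumerate(modes):
            if np.linalg.norm(end - mode) < merge_radius:
                labels.append(k)
                break
        else:
            # No known maximum is close enough, so record a new one.
            modes.append(end)
            labels.append(len(modes) - 1)
    return np.array(modes), np.array(labels)
```

Calling `assign_clusters(data, h=0.5, merge_radius=0.5)` on an `(n, d)` array returns the maxima that were found and one cluster label per point. Each climb costs O(n) per iteration here; the sampling-based acceleration mentioned in the abstract is not sketched.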

Similar resources

Comparative Study of Density based Clustering Algorithms

This paper presents a comparative study of three density-based clustering algorithms: DENCLUE, DBCLASD, and DBSCAN. Six parameters are considered for their comparison. The result is supported by a firm experimental evaluation. This analysis helps in choosing the appropriate density-based clustering algorithm in various situations. General Terms: Algorithms.

Fast Estimation of Nonparametric Kernel Density Through PDDP, and its Application in Texture Synthesis

In this work, a new algorithm is proposed for fast estimation of nonparametric multivariate kernel density, based on principal direction divisive partitioning (PDDP) of the data space. The goal of the proposed algorithm is to use the finite support property of kernels for fast estimation of density. Compared to earlier approaches, this work explains the need of using boundaries (for partitioning the sp...

Comparison of the Gamma kernel and the orthogonal series methods of density estimation

The standard kernel density estimator suffers from boundary bias when estimating probability density functions of distributions on the positive real line. The Gamma kernel estimators and orthogonal series estimators are two alternatives that are free of boundary bias. In this paper, a simulation study is conducted to compare the small-sample performance of the Gamma kernel estimators and the orthog...

An Efficient Approach to Clustering in Large Multimedia Databases with Noise

Several clustering algorithms can be applied to clustering in large multimedia databases. The effectiveness and efficiency of the existing algorithms, however, is somewhat limited, since clustering in multimedia databases requires clustering high-dimensional feature vectors and since multimedia databases often contain large amounts of noise. In this paper, we therefore introduce a new algorithm...

Publication date: 2007